Modeling Molecular Mass
2025-08-02
\[\gamma_{API}=\frac{141.5}{\gamma_{o}}-131.5\]
| Symbol | Meaning | Units |
|---|---|---|
| \(\gamma_{API}\) | API Gravity | [degAPI] |
| \(\gamma_o\) | Specific Gravity | [1/wtr] |
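The API gravity relation above is easy to check numerically. The following is a minimal sketch; the function names `api_gravity` and `sg_from_api` are my own and do not come from the original post.

```python
def api_gravity(sg: float) -> float:
    """Convert specific gravity (relative to water) to degrees API."""
    if sg <= 0:
        raise ValueError("specific gravity must be positive")
    return 141.5 / sg - 131.5


def sg_from_api(api: float) -> float:
    """Inverse relation: degrees API back to specific gravity."""
    return 141.5 / (api + 131.5)


# Water has a specific gravity of 1.0, which corresponds to 10 degrees API.
print(api_gravity(1.0))
```

Lighter liquids (specific gravity below 1) give API gravities above 10, which is why the relation is a convenient target for a symbolic regression test.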
from pysr import PySRRegressor
myMod=PySRRegressor()
myMod.fit(x.reshape(-1, 1),y)
y_pred=myMod.predict(x.reshape(-1, 1))
myEq=myMod.sympy()
\(x_0-(x_0+0.013196754)+1.0131962+ \frac{x_0 (-132.5)- -141.5}{x_0}\)
sym.simplify(myEq)
\(-131.500000554+\frac{141.5}{x_0}\)
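As a sanity check, the raw expression and its simplified form can be compared numerically without any symbolic library. This is only a sketch: the sample points are arbitrary and the helper names are mine.

```python
def raw_expr(x0: float) -> float:
    # Expression as returned by the search, before simplification.
    return x0 - (x0 + 0.013196754) + 1.0131962 + (x0 * (-132.5) - -141.5) / x0


def simplified_expr(x0: float) -> float:
    # Form reported after simplification.
    return -131.500000554 + 141.5 / x0


# Arbitrary sample points; the two forms should agree up to rounding error.
for x0 in (0.6, 0.8, 1.0, 1.2):
    print(x0, raw_expr(x0) - simplified_expr(x0))
```

The redundant `x_0 - x_0` terms cancel, and the leftover constants combine to roughly -131.5, so the search has effectively rediscovered the API gravity formula.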
w=(random.rand(21)-0.5)*2.5
\[-132.05688+\frac{141.88547}{x_0}\]
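To see why the recovered constants drift once noise is added, here is a small sketch that uses ordinary least squares rather than symbolic regression. It assumes the functional form \(a/x + b\) is already known, and the specific gravity range, the seed, and the helper name `fit_inverse_model` are all my assumptions, not values from the post.

```python
import random


def fit_inverse_model(xs, ys):
    """Least-squares fit of y = a / x + b, which is linear in u = 1 / x."""
    us = [1.0 / x for x in xs]
    n = len(us)
    u_mean = sum(us) / n
    y_mean = sum(ys) / n
    s_uu = sum((u - u_mean) ** 2 for u in us)
    s_uy = sum((u - u_mean) * (y - y_mean) for u, y in zip(us, ys))
    a = s_uy / s_uu
    b = y_mean - a * u_mean
    return a, b


random.seed(0)  # assumption: fixed seed so the sketch is repeatable
xs = [0.6 + 0.02 * i for i in range(21)]  # assumption: 21 specific gravities, 0.6 to 1.0
clean = [141.5 / x - 131.5 for x in xs]
noise = [(random.random() - 0.5) * 2.5 for _ in xs]  # uniform in [-1.25, 1.25)
noisy = [y + w for y, w in zip(clean, noise)]

print(fit_inverse_model(xs, clean))  # close to (141.5, -131.5)
print(fit_inverse_model(xs, noisy))  # constants shift with the noise
```

With only 21 points and noise of this size, shifts of a few tenths in each constant are expected, which is consistent with the fitted equation above.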
binary_operators=["+","-","*","/"]
unary_operators=None
maxsize=30
| 1 | 2 | 3 | 4 | 5 |
|---|---|---|---|---|
| minus | divide | 141.5 | specific gravity | 131.5 |
maxdepth=None
elementwise_loss="L2DistLoss()"
\[ \frac{1}{n} \sum_{k=1}^{n}{\left(\hat{y}_k-y_k \right)}^2\]
\[ \frac{1}{n} \sum_{k=1}^{n}{\left|\hat{y}_k-y_k \right|}\]
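The two loss definitions above (mean squared error and mean absolute error) can be sketched in plain Python. The function names are mine, and the sample values are made up purely to show how one outlier affects each loss.

```python
def l2_loss(y_pred, y_true):
    """Mean squared error: average of squared residuals."""
    n = len(y_true)
    return sum((p - t) ** 2 for p, t in zip(y_pred, y_true)) / n


def l1_loss(y_pred, y_true):
    """Mean absolute error: average of absolute residuals."""
    n = len(y_true)
    return sum(abs(p - t) for p, t in zip(y_pred, y_true)) / n


# Made-up values: the last prediction misses by 2, the others are exact.
y_true = [1.0, 2.0, 5.0]
y_pred = [1.0, 2.0, 3.0]
print(l2_loss(y_pred, y_true), l1_loss(y_pred, y_true))
```

The squared loss penalizes the single large residual more heavily than the absolute loss does, which is the usual reason for switching to an L1-style loss when outliers are a concern.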
parsimony=0.0
model_selection="best"
should_simplify=True
tournament_selection_n=15
niterations=100
max_evals=None
timeout_in_seconds=None
populations=31
population_size=27
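For reference, the settings discussed so far can be collected into a single keyword dictionary. This is a sketch of how they might be passed together; the variable name `pysr_settings` is mine, and the constructor call is left commented out because it needs PySR and its Julia backend installed.

```python
# Settings gathered from the discussion above, using PySR's keyword names.
pysr_settings = {
    "binary_operators": ["+", "-", "*", "/"],
    "unary_operators": None,
    "maxsize": 30,
    "maxdepth": None,
    "elementwise_loss": "L2DistLoss()",
    "parsimony": 0.0,
    "model_selection": "best",
    "should_simplify": True,
    "tournament_selection_n": 15,
    "niterations": 100,
    "max_evals": None,
    "timeout_in_seconds": None,
    "populations": 31,
    "population_size": 27,
}

# Uncomment when PySR is available:
# from pysr import PySRRegressor
# model = PySRRegressor(**pysr_settings)

for name, value in pysr_settings.items():
    print(f"{name} = {value!r}")
```

Keeping the settings in one dictionary makes it easier to vary a single option between runs while holding the rest fixed.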
\[ \frac{a_{00}}{x}+a_{01}\]
w<-seq(from=121.5,to=161.5, by=10)
<expr> ::= <op>(<expr>, <expr>) | <var> | <con>
<op> ::= "+" | "-" | "*" | "/"
<var> ::= x
<con> ::= 121.5 | 131.5 | 141.5 | 151.5 | 161.5
Grammatical Evolution Search Results:
No. Generations: 1531
Best Expression: 141.5/x - 131.5
Best Cost: 0
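The reported best expression is a valid derivation of the grammar above, and this can be checked with a small evaluator. The sketch below represents an expression as a nested tuple and assumes the cost is the mean squared error against the API gravity formula; the actual cost function used in the run is not shown in the post, and all names here are mine.

```python
import operator

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul, "/": operator.truediv}
CONSTANTS = (121.5, 131.5, 141.5, 151.5, 161.5)  # the <con> alternatives


def evaluate(expr, x):
    """Evaluate a derivation tree: 'x', a constant, or (op, left, right)."""
    if expr == "x":
        return x
    if isinstance(expr, tuple):
        op, left, right = expr
        return OPS[op](evaluate(left, x), evaluate(right, x))
    return float(expr)


def cost(expr, xs, ys):
    """Assumed cost: mean squared error over the sample points."""
    return sum((evaluate(expr, x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


xs = [0.6 + 0.05 * i for i in range(9)]  # assumption: sample specific gravities
ys = [141.5 / x - 131.5 for x in xs]

best = ("-", ("/", 141.5, "x"), 131.5)   # 141.5/x - 131.5
other = ("-", ("/", 151.5, "x"), 131.5)  # same shape, wrong constant
print(cost(best, xs, ys), cost(other, xs, ys))
```

Because the grammar only offers five constants, the search reduces to finding the right tree shape and the right pair of constants, which is why an exact zero cost is reachable.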
| Variable | Description | Designation |
|---|---|---|
| \(Mw\) | Molecular Mass | dependent |
| \(SG\) | Specific Gravity | independent |
| \(TBP\) | True Boiling Point | independent |
| Statistic | SG | TBP | MW |
|---|---|---|---|
| Min. | 0.6310 | 306.0 | 70.0 |
| 1st Qu. | 0.7439 | 373.2 | 99.0 |
| Median | 0.8474 | 584.0 | 222.5 |
| Mean | 0.8470 | 575.2 | 304.6 |
| 3rd Qu. | 0.8784 | 668.5 | 297.8 |
| Max. | 1.3793 | 1012.0 | 1685.0 |
Raw Molecular Mass Histogram
Raw Molecular Mass Box-and-Whiskers Plot
Raw Boiling Point Box-and-Whiskers Plot
Raw Boiling Point Histogram
Raw Specific Gravity Histogram
Raw Specific Gravity Box-and-Whiskers Plot
Molecular Mass vs Specific Gravity Goossens Data
Molecular Mass vs Boiling Point Goossens Data
Specific Gravity vs Boiling Point Goossens Data
Molecular Mass vs Boiling Point. Both Datasets
Molecular Mass vs Specific Gravity. Both Datasets
| Symbol | Meaning |
|---|---|
| \(M_w\) | Apparent Molecular Mass |
| \(T_b\) | True Boiling Point Temperature |
| \(\gamma_o\) | Specific Gravity |
| \(a_{00} \ldots a_{11}\) | Empirical Constants |
| \(K_w\) | Characterization Factor (intermediate value) |
| \(X_0 \ldots X_2\) | Intermediate Variables |
Hariu & Sage (1969)
\[ M_w = a_{00} + a_{01} K_w + a_{02} K_w^2 + a_{03} T_b K_w + a_{04} T_b K_w^2 + a_{05} T_b^2 K_w + a_{06} T_b^2 K_w^2 \]
\[K_w =\frac{\sqrt[3]{T_b}}{\gamma_o}\]
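The characterization factor is a one-line computation. The sketch below uses a function name of my own; the post does not state the temperature units here, so the comment about degrees Rankine reflects the usual convention for this factor and is an assumption.

```python
def characterization_factor(t_b: float, sg: float) -> float:
    """K_w = cube root of the boiling point divided by specific gravity.

    Assumption: t_b is an absolute temperature (conventionally degrees Rankine).
    """
    if t_b <= 0 or sg <= 0:
        raise ValueError("t_b and sg must be positive")
    return t_b ** (1.0 / 3.0) / sg


# Illustrative inputs only, not taken from the dataset.
print(characterization_factor(1000.0, 0.8))
```

Since \(K_w\) depends on both inputs, correlations written in terms of \(T_b\) and \(K_w\) still use the same two measured quantities as the others.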
Kesler & Lee (1976)
\[M_w = X_0 + \frac{X_1}{T_b} + \frac{X_2}{T_b^2}\]
\[X_0 = a_{00} + a_{01} \gamma_o + \left (a_{02} + a_{03} \gamma_o \right ) T_b\]
\[ X_1 = \left (1+ a_{04} \gamma_o + a_{05} \gamma_o^2 \right ) \left (a_{06} + \frac{a_{07}}{T_b} \right ) \cdot 10^7 \]
\[ X_2 = \left (1+ a_{08} \gamma_o + a_{09} \gamma_o^2 \right ) \left (a_{10} + \frac{a_{11}}{T_b} \right ) \cdot 10^{12} \]
American Petroleum Institute (1977)
\[ M_w = a_{00} e^{\left (a_{01} T_b \right )} e^{\left (a_{02} \gamma_o \right )} T_b^{a_{03}} \gamma_o^{a_{04}} \]
Winn, Sim & Daubert (1980)
\[M_w = a_{00} T_b^{a_{01}} \gamma_o^{a_{02}}\]
Riazi & Daubert (1980)
\[M_w = a_{00} T_b^{a_{01}} \gamma_o^{a_{02}}\]
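The two power-law correlations above share one functional form, which becomes linear after taking logarithms. The sketch below evaluates the form both ways; the coefficients are placeholders chosen for easy arithmetic, not published values, and the function names are mine.

```python
import math


def power_law_mw(t_b: float, sg: float, a00: float, a01: float, a02: float) -> float:
    """M_w = a00 * T_b**a01 * sg**a02 (coefficients supplied by the caller)."""
    return a00 * t_b ** a01 * sg ** a02


def power_law_mw_via_logs(t_b, sg, a00, a01, a02):
    """Same model written as ln M_w = ln a00 + a01 ln T_b + a02 ln sg."""
    return math.exp(math.log(a00) + a01 * math.log(t_b) + a02 * math.log(sg))


# Placeholder coefficients for illustration only.
print(power_law_mw(10.0, 0.5, 2.0, 2.0, -1.0))
print(power_law_mw_via_logs(10.0, 0.5, 2.0, 2.0, -1.0))
```

The log-linear form is what makes these correlations easy to fit with ordinary linear regression, in contrast to the more nested expressions that follow.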
Rao & Bardon (1985)
\[\ln {M_w} = (a_{00} + a_{01} K_w) \ln \left (\frac{T_b} {a_{02} + a_{03} K_w} \right )\]
Riazi & Daubert (1987)
\[ M_w = a_{00} T_b^{a_{01}} \gamma_o^{a_{02}} e^{\left (a_{03} T_b + a_{04} \gamma_o + a_{05} T_b \gamma_o \right )} \]
Goossens (1996)
\[M_w = a_{00} T_b^{X_0}\]
\[ X_0 =\frac {a_{03} + a_{04} \ln {\left (\frac{T_b} {a_{05} - T_b} \right )}} {a_{01} \gamma_o + a_{02}} \]
Linan (2011)
\[ M_w = a_{00} e^{\left (a_{01} T_b \right )} e^{\left (a_{02} \gamma_o \right )} T_b^{a_{03}} \gamma_o^{a_{04}} \]
Hosseinifar & Shahverdi (2021)
\[M_w = {\left [a_{00} T_b^{a_{01}} {\left (\frac{3+2\gamma_o} {3-\gamma_o} \right )}^{\frac{a_{02}}{2}} + a_{03} T_b^{a_{04}} {\left (\frac{3+2\gamma_o}{3-\gamma_o} \right )}^{\frac{a_{05}}{2}} \right ]}^{a_{06}}\]
Stratiev (2023)
\[ M_w = a_{00} + a_{01} e^{\left [a_{02} e^{\left (a_{03} \frac{T_b^{a_{06}}}{\gamma_o^{a_{05}}} \right )} \right ]} \]
| Operator | Type | Description |
|---|---|---|
| pow | binary | one expression raised to the power of another |
| log | unary | logarithm of an expression |
| exp | unary | antilogarithm of an expression |
| sqr | unary | expression squared |
| cub | unary | expression cubed |
| inv | unary | inverse of an expression |
\[ M_w=a_{00}+\frac{a_{01}}{\gamma_o-a_{02}}+\frac{T_b\cdot (a_{03}\cdot T_b-a_{04})}{\gamma_o\cdot (a_{05}-\frac{a_{06}}{T_b})} \]
| | Raw MW | Goossens Equation | This Equation |
|---|---|---|---|
| Raw MW | 1.000000 | 0.999711 | 0.999847 |
| Goossens Equation | 0.999711 | 1.000000 | 0.999798 |
| This Equation | 0.999847 | 0.999798 | 1.000000 |
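Correlation tables like the one above can be reproduced with a few lines of code. The sketch below assumes the entries are Pearson correlation coefficients, which the post does not state explicitly; the function names and the demonstration numbers are hypothetical and are not the post's data.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    s_xy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    s_xx = sum((x - x_mean) ** 2 for x in xs)
    s_yy = sum((y - y_mean) ** 2 for y in ys)
    return s_xy / math.sqrt(s_xx * s_yy)


def correlation_matrix(columns):
    """columns maps a label to its values; returns a nested dict of coefficients."""
    return {a: {b: pearson(columns[a], columns[b]) for b in columns} for a in columns}


# Hypothetical numbers for illustration only.
demo = {
    "Raw MW": [100.0, 200.0, 300.0, 400.0],
    "Model": [110.0, 190.0, 310.0, 390.0],
}
for label, row in correlation_matrix(demo).items():
    print(label, [round(r, 6) for r in row.values()])
```

A coefficient close to 1 only says the predictions move together with the data; it does not by itself rule out a systematic offset or scale error.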
\[M_w=a_{00} \cdot a_{01}^{\gamma_o^{-a_{02}} + a_{03} \cdot T_b} + a_{04}\]
| | Raw MW | This Equation |
|---|---|---|
| Raw MW | 1.000000 | 0.997281 |
| This Equation | 0.997281 | 1.000000 |
\[ M_w= - T_b \cdot \left(a_{00} \cdot T_b - a_{01}\right) \left(a_{02} \cdot 10^{-6} \left(a_{03} \cdot \gamma_o - 2 \cdot T_b \right) \left(T_b - a_{04}\right) - 1\right) + a_{05}\]
\[ M_w= a_{00}\cdot T_b + a_{01} \cdot e^{- \gamma_o^{2} + a_{02}\cdot \gamma_o + a_{03} \cdot T_b}\]
| | Raw MW | First Equation | Second Equation |
|---|---|---|---|
| Raw MW | 1.000000 | 0.997705 | 0.998420 |
| First Equation | 0.997705 | 1.000000 | 0.999497 |
| Second Equation | 0.998420 | 0.999497 | 1.000000 |
Raw Data Scale Change
\[ M_w=(a_{00}\cdot T_b-a_{01})e^{a_{02}\cdot 10^{-9}T_b^2(a_{03}\cdot \gamma_o +a_{04}\cdot T_b)} \]
\[ M_w=-a_{00}\cdot \gamma_o+\frac{e^{-\gamma_o^2+{\log(T_b)}^3}}{T_b^{a_{01}}} +a_{02}\cdot T_b \]
| | Raw Mw | First Equation | Second Equation |
|---|---|---|---|
| Raw Mw | 1.000000 | 0.998880 | 0.999324 |
| First Equation | 0.998880 | 1.000000 | 0.998851 |
| Second Equation | 0.999324 | 0.998851 | 1.000000 |
\(\lambda=-0.3624\)
Shapiro-Wilk p-value \(0.0507\)
\[ M_w^t = \left (\gamma_o \left(a_{00} T_b -a_{01} \right) \log{(\gamma_o)} + a_{02} \right) \log{(T_b)} \]
Reverse the Box-Cox transform:
\[ M_w = {\left(\lambda M_w^t +1 \right)}^{\frac{1}{\lambda}} \]
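The forward and inverse Box-Cox transforms can be sketched as follows, using the \(\lambda\) reported above. The forward formula is the standard one-parameter Box-Cox definition, which is consistent with the inverse shown; the function names and the sample value are mine.

```python
import math

LAMBDA = -0.3624  # value of lambda reported above


def box_cox(y: float, lam: float = LAMBDA) -> float:
    """Standard one-parameter Box-Cox transform of a positive value."""
    if y <= 0:
        raise ValueError("Box-Cox requires positive values")
    if lam == 0:
        return math.log(y)
    return (y ** lam - 1.0) / lam


def inverse_box_cox(y_t: float, lam: float = LAMBDA) -> float:
    """Undo the transform: M_w = (lam * M_w^t + 1) ** (1 / lam)."""
    if lam == 0:
        return math.exp(y_t)
    return (lam * y_t + 1.0) ** (1.0 / lam)


# Round trip on an illustrative molecular mass.
mw = 222.5
mw_t = box_cox(mw)
print(mw_t, inverse_box_cox(mw_t))
```

With a negative \(\lambda\) the transformed values are bounded above by \(-1/\lambda\), so a fitted model that predicts beyond that bound cannot be inverted; it is worth checking predictions against that limit.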
| | Raw Mw | This Equation |
|---|---|---|
| Raw Mw | 1.000000 | 0.995897 |
| This Equation | 0.995897 | 1.000000 |
\[ M_w=-a_{00} \cdot \gamma_o +\frac{a_{01}\cdot \gamma_o}{T_b}+a_{02}\cdot T_b -a_{03}\]
| | Raw Mw | This Equation |
|---|---|---|
| Raw Mw | 1.000000 | 0.970451 |
| This Equation | 0.970451 | 1.000000 |
\[ M_w=-a_{00}\cdot \gamma_o\cdot T_b +a_{01}\cdot T^2_b +a_{02}\cdot e^{\gamma_o} +a_{03}\]
| | Raw Mw |
|---|---|
| Raw Mw | 1.000000 |
| 1st Run Equation | 0.970451 |
| 2nd Run Equation | 0.985800 |
| Run | Correlation Coefficient |
|---|---|
| Raw Mw | 1.000000 |
| Default | 0.999957 |
| Goossens Correlation | 0.999939 |
| Power | 0.996964 |
| Exponential #2 | 0.998954 |
| Aeon #1 | 0.999973 |
| Aeon #2 | 0.999921 |
| Box-Cox | 0.999696 |
| Sparse #1 | 0.992303 |
| Sparse #2 | 0.996331 |